37 research outputs found

Using Robust PCA to estimate regional characteristics of language use from geo-tagged Twitter messages

Author: Barankai Norbert
Csabai István
Dobos László
Hanyecz Tamás
Kallus Zsófia
Kondor Dániel
Sebők Tamás
Szüle János
Vattay Gábor
Publication date: 01/01/2013

Principal component analysis (PCA) and related techniques have been successfully employed in natural language processing. Text mining applications in the age of online social media (OSM) face new challenges due to properties specific to these use cases (e.g. spelling issues specific to texts posted by users, the presence of spammers and bots, service announcements, etc.). In this paper, we employ a Robust PCA technique to separate typical outliers and highly localized topics from the low-dimensional structure present in language use in online social networks. Our focus is on identifying geospatial features among the messages posted by the users of the Twitter microblogging service. Using a dataset which consists of over 200 million geolocated tweets collected over the course of a year, we investigate whether the information present in word usage frequencies can be used to identify regional features of language use and topics of interest. Using the PCA pursuit method, we are able to identify important low-dimensional features, which constitute smoothly varying functions of the geographic location.

arXiv.org e-Print Archive

CiteSeerX

Crossref

Inkrementalizmus és megszakított egyensúly a magyar költségvetés teljesülésében (1991-2013) [Incrementalism and punctuated equilibrium in the execution of the Hungarian budget (1991-2013)]

Author: Berki Tamás
Sebők Miklós
Publication venue: Magyar Tudományos Akadémia Társadalomtudományi Kutatóközpont Politikatudományi Intézet
Publication date: 01/01/2018

Repository of the Academy's Library

Race, Religion and the City: Twitter Word Frequency Patterns Reveal Dominant Demographic Dimensions in the United States

Author: Bokányi Eszter
Csabai István
Dobos László
Kondor Dániel
Sebők Tamás
Stéger József
Vattay Gábor
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2016

Recently, numerous approaches have emerged in the social sciences to exploit the opportunities made possible by the vast amounts of data generated by online social networks (OSNs). Having access to information about users on such a scale opens up a range of possibilities, all without the limitations associated with often slow and expensive paper-based polls. A question that remains to be satisfactorily addressed, however, is how demography is represented in OSN content. Here, we study language use in the US using a corpus of text compiled from over half a billion geo-tagged messages from the online microblogging platform Twitter. Our intention is to reveal the most important spatial patterns in language use in an unsupervised manner and relate them to demographics. Our approach is based on Latent Semantic Analysis (LSA) augmented with the Robust Principal Component Analysis (RPCA) methodology. We find spatially correlated patterns that can be interpreted based on the words associated with them. The main language features can be related to slang use, urbanization, travel, religion and ethnicity, the patterns of which are shown to correlate plausibly with traditional census data. Our findings thus validate the concept of demography being represented in OSN language use and show that the traits observed are inherently present in the word frequencies without any previous assumptions about the dataset. Thus, they could form the basis of further research focusing on the evaluation of demographic data estimation from other big data sources, or on the dynamical processes that result in the patterns found here.

arXiv.org e-Print Archive

Repository of the Academy's Library

ELTE Digital Institutional Repository (EDIT)

Webcam based analysis of facial expressions

Author: Nagy Tamás
Sebők Judit
Publication date: 01/01/2012

University of Szeged

The effect of central bank communication on sovereign bond yields: The case of Hungary

Author: Barczikay Tamás
Máté Ákos
Sebők Miklós
Publication venue: Public Library of Science (PLoS)
Publication date: 01/01/2021

In this article we investigate how the public communication of the Hungarian Central Bank's Monetary Council (MC) affects Hungarian sovereign bond yields. This research ties into the advances made in the financial and political economy literature which rely on extensive textual data and quantitative text analysis tools. While prior research demonstrated that forward guidance, in the form of council meeting minutes or press releases, can be used as a predictor of rate decisions, we are interested in whether it is able to directly influence asset returns as well. In order to capture the effect of central bank communication, we measure the latent hawkish or dovish sentiment of MC press releases from 2005 to 2019 by applying a sentiment dictionary, a staple in the text mining toolkit. Our results show that central bank forward guidance has an intra-year effect on bond yields. However, the hawkish or dovish sentiment of press releases has no impact on maturities of one year or longer, where the policy rate proves to be the most important explanatory variable. Our research also contributes to the literature by applying a specialized dictionary to monetary policy as well as broadening the discussion by analyzing a case from the non-eurozone Central-Eastern region of the European Union.

Directory of Open Access Journals

Repository of the Academy's Library

A multi-terabyte relational database for geo-tagged social network data

Author: Dániel Kondor
Gábor Vattay
István Csabai
János Szüle
József Stéger
László Dobos
Tamás Bodnár
Tamás Hanyecz
Tamás Sebők
Zsófia Kallus
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2013

ELTE Digital Institutional Repository (EDIT)

Using Robust PCA to estimate regional characteristics of language use from geo-tagged Twitter messages

Author: Dániel Kondor
Gábor Vattay
István Csabai
János Szüle
László Dobos
Norbert Barankai
Tamás Hanyecz
Tamás Sebők
Zsófia Kallus
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2013

ELTE Digital Institutional Repository (EDIT)

Regional properties of global communication as reflected in aggregated Twitter data

Author: Dániel Kondor
Gábor Vattay
István Csabai
János Szüle
József Stéger
László Dobos
Norbert Barankai
Tamás Hanyecz
Tamás Sebők
Zsófia Kallus
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2013

ELTE Digital Institutional Repository (EDIT)

Methanol oxidation catalyst by atomic layer deposition

Author: Ballai Gergő
Gyenes Tamás
Haspel Henrik
Kukovecz Ákos
Kónya Zoltán
Sebők Dániel
Szenti Imre
Vásárhelyi Lívia
Publication venue: University of Szeged
Publication date: 01/01/2021

Direct methanol fuel cells (DMFCs) are very appealing alternatives for fighting climate change, particularly in the field of personal mobility solutions. However, DMFCs also have some serious competitive disadvantages, such as the high cost of the noble metal catalysts, the difficulties of catalyst application, and the poisoning of the catalyst due to carbon monoxide formation. Here we demonstrate that depositing platinum on TiO2 by atomic layer deposition (ALD) is an easy, reproducible method for the synthesis of a TiO2-supported platinum catalyst for methanol oxidation with excellent resistance to CO poisoning.

University of Szeged